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ABSTRACT 

We calculate the non-linear galaxy power spectrum in real space, including non-linear distortion 
of the Baryon Acoustic Oscillations, using the standard 3rd-order perturbation theory (PT). The 
calculation is based upon the assumption that the number density of galaxies is a local function 
of the underlying, non-linear density field. The galaxy bias is allowed to be both non-linear and 
stochastic. We show that the PT calculation agrees with the galaxy power spectrum estimated from 
the Millennium Simulation, in the weakly non-linear regime (defined by the matter power spectrum) 
at high rcdshifts, 1 < z < 6. We also show that, once 3 free parameters characterizing galaxy bias 
are marginalized over, the PT power spectrum fit to the Millennium Simulation data yields unbiased 
estimates of the distance scale, D, to within the statistical error. This distance scale corresponds 
to the angular diameter distance, Da{z), and the expansion rate, H{z), in real galaxy surveys. Our 
results presented in this paper are still restricted to real space. The future work should include the 
effects of non-linear redshift space distortion. Nevertheless, our results indicate that non-linear galaxy 
bias in the weakly non-linear regime at high redshifts is reasonably under control. 
Subject headings: cosmology : theory — large-scale structure of universe 



1. INTRODUCTION 

Surveys of galaxies are the oldest way of mapping cos- 
mological fluctuations. Over the last three decades they 
have been used for measuring cosmological parameters, 
such as the matter density of the universe, Vim (see Pee- 
bles 1993, for a review). 

The galaxy surveys are largely complementary to 
CMB, as they allow us to determine the important cos- 
mological parameters that remain poorly constrained by 
the CMB data alone (e.g., Takada et al. 2006): e.g., the 
mass of neutrinos, the shape of the primordial power 
spectrum, and the properties of dark energy. 

The latest data sets. Two Degree Field Galaxy Red- 
shift Survey (2dFGRS, Cole et al. 2005) and Sloan Dig- 
ital Sky Survey (SDSS, Tegmark et al. 2006), have en- 
abled us to determine most of the cosmological param- 
eters to better than 5% accuracy, when combined with 
the Cosmic Microwave Background (CMB) data from the 
Wilkinson Microwave Anisotropy Probe (Bennett et al. 
2003; Spergel et al. 2003, 2007; Hinshaw et al. 2008; 
Dunkley et al. 2008; Komatsu et al. 2008). 

The galaxy power spectrum, the Fourier transform of 
the galaxy two point correlation function, has been used 
widely for extracting cosmological information from the 
galaxy survey data. The amplitude, overall shape, as 
well as oscillatory features (called the Baryon Acoustic 
Oscillations, or BAOs) contain a wealth of cosmologi- 
cal information (see Weinberg 2008, for a recent review). 
In order to extract this information correctly, we must 
understand how the observed galaxy power spectra are 
related to the underlying cosmological models. 

How do we model the galaxy power spectrum? We 
may use the cosmological perturbation theory (PT). The 
accuracy of the linear PT has been verified observation- 
ally by the temperature and polarization data of CMB 
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measured by WMAP (Hinshaw et al. 2003, 2007; Kogut 
et al. 2003; Page et al. 2007; Nolta et al. 2008). How- 
ever, we cannot use the linear PT for the galaxy power 
spectrum, as the matter density field grows non-linearly 
due to gravitational instability. One must therefore use 
the non-linear PT. 
There are three sources of non-linearities: 

(1) Non-linear evolution of the underlying matter den- 
sity field, which alters the matter power spectrum 
away from the linear prediction. 

(2) Non-linear galaxy bias, or non-linear mapping be- 
tween the underlying matter density field and the 
distribution of collapsed objects such as dark mat- 
ter halos and galaxies, which alters the galaxy 
power spectrum away from the matter power spec- 
trum. 

(3) Non-linear redshift space distortion, which arises as 
the observed redshifts of galaxies used for measur- 
ing locations of galaxies along the line of sight con- 
tain both the Hubble expansion and the peculiar 
velocity of galaxies. This leads to the systematic 
shifts in the line-of-sight positions of galaxies, al- 
tering the galaxy power spectrum in redshift space 
away from that in real space. 

Using the 3rd-order PT (see Bernardeau et al. 2002, for 
a review) we have shown that the first effect can be mod- 
eled accurately in the weakly non-linear regime (Jeong & 
Komatsu 2006, hereafter Paper I). In this paper we ad- 
dress the second effect, the non-linear galaxy bias, using 
the 3rd-order PT. We will address the third effect, the 
non-linear redshift space distortion, in the future work. 

Our study is motivated by recently proposed high 
redshift galaxy surveys such as Cosmic Inflation Probe 
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(CIP)^, Hobby-Eberly Dark Energy Experiment (HET- 
DEX; Hill ct al. 2004), Baryon Oscillation Spectroscopic 
Survey (BOSS)^, and Wide-field Fiber-fed Multi Object 
Spectrograph survey (WFMOS; Glazebrook et al. 2005), 
to mention a few. These proposed surveys will observe 
the galaxy power spectra to the unprecedented preci- 
sion, which demands the precision modeling of the galaxy 
power spectrum at 1% accuracy or better. 

Over the last decade, the non-linear PT, including 
modeling of non-linear galaxy power spectra, had been 
studied actively (see Bernardeau et al. 2002, for a re- 
view). However, PT had never been applied to the real 
data such as 2dFGRS or SDSS, as non-linearities arc too 
strong for PT to be valid at low rcdshifts, z < I (e.g., 
Meiksin et al. 1999). At high redshifts, i.e., z > 1, how- 
ever, PT is expected to perform better because of weaker 
non-linearity. In Paper I we have shown that the matter 
power spectrum computed from the 3rd-order PT de- 
scribes that from iV-body simulations accurately.^ 

But, what about the galaxy power spectrum? One may 
generally expect that, since non-linearities were milder in 
a high-z universe, there should be a plenty of room for 
PT to be a good approximation. On the other hand, 
galaxies were more highly biased at higher redshifts for a 
given mass, and therefore one might suspect, somewhat 
naively, that non-linear bias could compromise the suc- 
cess of PT. In this paper we shall show that is not the 
case, and PT does provide a good approximation to the 
galaxy power spectrum at high redshifts. 

This paper is organized as follows. In § 2 we give the 
formula for the 3rd-order PT galaxy power spectrum. In 
§ 3 we compare the 3rd-order PT matter power spec- 
trum with the matter power spectrum estimated from 
the Millennium Simulation (Springel ct al. 2005), in or- 
der to confirm our previous results (Paper I) with the 
Millennium Simulation. In § 4 we show that the PT cal- 
culation of the galaxy power spectrum agrees with the 
galaxy power spectrum estimated from the Millennium 
Simulation in the weakly non-linear regime (defined by 
the matter power spectrum) at high redshifts, 1 < 2; < 6. 
In § 5 we extract the distance scale from the Millennium 
Simulation, which is related to the angular diameter dis- 
tance and the expansion rate of the universe in real sur- 
veys. In § 6 we give discussion and conclusions. 

2. NON-LINEAR GALAXY POWER SPECTRUM 
FROM PERTURBATION THEORY 

2.1. Locality Assumption 

Galaxies are biased tracers of the underlying density 
field (Kaiser 1984), which implies that the distribution of 
galaxies depends on the underlying matter density fluc- 
tuations in a complex way. This relation must depend 
upon the detailed galaxy formation processes, which are 
not yet understood completely. 

However, on large enough scales, one may approximate 
this function as a local function of the underlying den- 
sity fluctuations, i.e., the number density of galaxies at 
a given position in the universe is given solely by the un- 
derlying matter density at the same position. With this 

^ http;//cfa-www. harvard.edu/cip 
^ http:/ /howdy.physics. nyu.edu/index.php/BOSS 
^ See also Jain & Bertschinger (1994) for the earlier, pioneering 
work. 



approximation, one may expand the density fluctuations 
of galaxies, 6g, in terms of the underlying matter den- 
sity fluctuations, as (Fry & Gaztanaga 1993; McDonald 
2006) 

<5g(x) = e + 6i(5(x) + ^b2S^{^) + ^h6^{^) + ..., (1) 

where 6„ are the galaxy bias parameters, and e is a 
random variable that represents the "stochasticity" of 
the galaxy bias, i.e., the relation between (5g(x) and 
5(x) is not deterministic, but contains some noise (e.g., 
Yoshikawa et al. 2001, and references therein). We as- 
sume that the stochasticity is white noise, and is uncorre- 
lated with the density fluctuations, i.e., (e^) = 0. While 
both of these assumptions should be violated at some 
small scales, we assume that these are valid assumptions 
on the scales that wc are interested in - namely, on the 
scales where the 3rd-ordcr PT describes the non-linear 
matter power spectrum with 1% accuracy. Since both 
bias parameters and stochasticity evolve in time (Fry 
1996; Tcgmark & Peebles 1998), we allow them to de- 
pend on redshifts. 

One obtains the traditional "linear bias model" when 
the Taylor series expansion given in Eq. (1) is truncated 
at the first order and the stochasticity is ignored. 

The precise values of the galaxy bias parameters de- 
pend on the galaxy formation processes, and different 
types of galaxies have different galaxy bias parameters. 
However, we are not interested in the precise values of the 
galaxy bias parameters, but only interested in extracting 
cosmological parameters from the observed galaxy power 
spectra with all the bias parameters marginalized over. 

2.2. 3rd-order PT galaxy power spectrum 

The analysis in this paper adopts the framework of 
McDonald (2006), and we briefly summarize the result 
for clarity. We shall use the 3rd-order PT; thus, we shall 
keep the terms up to the 3rd order in 5. The resulting 
power spectrum can be written in terms of the linear 
matter power spectrum, PL{k), and the 3rd order matter 
power spectrum, Pss{k), as 



Pg{k) =Po + bl Pss{k) + 62^2 (fc) + 6iPfe22(fc) 



(2) 



where Pf,2 and Pi>,22 are given by 
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and 



-P622 = 
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respectively, with F2 given by 



i2ny 

(2) 



-PLiq) 



Pi(|k-q|)-P(g) 



i^i'^(qi,q2) = ^ + ^ 



5 2 (qi • q2)^ q2 • q2 



+ 



1 



2 + ^2 
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We use the standard formula for Pgg (see Eq. (14) of 

Paper I and references therein). Here, bi, 62, and Pq are 
the non- linear bias parameters'*, which are given in terms 

These parameters correspond to 61, 62, and A'' in the original 
paper by McDonald (2006). 
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of the original coefficients for the Taylor expansion as 

68 

bj = bl + bib3(T'^ + —bib2a'^, 

h='^, (3) 
bi 

Po = {e') + lbll'^Pl{k), 

where a is the r.m.s. of density fluctuations. 

We will never have to deal with the original coeffi- 
cients, bi, 62, ^3, or e.^ Instead, we will only use the 
re-parametrized bias parameters, 61, &2, and Pq, as these 
are related more directly to the observables. As shown 
by McDonald (2006), in the large-scale limit. A; ^ 0, one 
finds 

Pg{k)^Po + blPLik). (4) 

Therefore, in the large-scale limit one recovers the tra- 
ditional linear bias model plus the constant term. Note 
that bi is the same as what is called the "effective bias" 
in Heavens et al. (1998). 

Throughout this paper we shall use Eq. (2) for calcu- 
lating the non-linear galaxy power spectra. 

2.3. Why we do not care about the precise values of 
bias parameters 

The precise values of the galaxy bias parameters de- 
pend on the details of the galaxy formation and evolu- 
tion, as well as on galaxy types, luminosities, and so on. 

However, our goal is to extract the cosmological infor- 
mation from the observed galaxy power spectra, without 
having to worry about which galaxies we are using as 
tracers of the underlying density field. 

Therefore, we will marginalize the likelihood function 
over the bias parameters, without ever paying attention 
to their precise values. Is this approach sensible? 

One might hope that one should be able to calculate 
the bias parameters for given properties of galaxies from 
the first principles using, e.g., sophisticated numerical 
simulations. 

Less numerically expensive way of doing the same thing 
would be to use the semi-analytical halo model approach, 
calibrated with a smaller set of numerical simulations 
(see Cooray & Sheth 2002, for a review). Using the peak- 
backgroimd split method (Sheth & Tormen 1999) based 
upon the excursion set approach (Bond et al. 1991), one 
can calculate 61, 62, &3, etc., the coefficients of the Tay- 
lor series expansion given in Eq. (1), for the density of 
dark matter halos. Once the bias parameters for dark 
matter halos arc specified, the galaxy bias parameters 
may be calculated rising the so-called Halo Occupation 
Distribution (HOD) (Seljak 2000). 

Smith et al. (2007) have attempted this approach, and 
shown that it is difficult to calculate even the power spec- 
trum of dark matter halos that matches A^-body simu- 
lations. The halo-model predictions for bias parameters 
are not yet accurate enough, and we do not yet have a 
correct model for Pq. 

The situation would be even worse for the galaxy power 
spectrum, as we would have to model the HOD in addi- 
tion to the halo bias. At the moment the form of HOD is 

^ For the expression of Pg{k) with the original coefRcients, see 
Heavens et al. (1998); Smith et al. (2007). 



basically a free empirical function. We therefore feel that 
it is dangerous to rely on our limited understanding of 
these complications for computing the bias parameters. 

This is the reason why we have decided to give up 
predicting the precise values of bias parameters entirely. 
Instead, we shall treat 3 bias parameters, 61, 62, and Pq, 
as free parameters, and fit them to the observed galaxy 
power spectra simultaneously with the cosmological pa- 
rameters. 

The most important question that we must ask is the 
following, "using the 3rd-order PT with 3 bias parame- 
ters, can we extract the correct cosmological parameters 
from the galaxy power spectra?" If the answer is yes, we 
will not have to worry about the precise values of bias 
parameters anymore. 

3. DARK MATTER POWER SPECTRUM FROM 
MILLENNIUM SIMULATION 

In this section we show that the matter power spectrum 
computed from the 3rd-order PT agrees with that esti- 
mated from the Millennium Simulation (Springel et al. 
2005). This result confirms our previous finding (Paper 

I)- . 

Using the result obtained in this section we define the 
maximum wavenumber, kmax, below which the 3rd-order 
PT may be trusted. The matter power spectrum gives 
an unambiguous definition of kmax, which will then be 
used thereafter when we analyze power spectra of halos 
and galaxies in § 4. 

3.1. Millennium Simulation 

The Millennium simulation (Springel et al. 2005) 
is a large A'^-body simulation with the box size of 
(500 Mpc//i)3 and 21603 dark matter particles. The 
cosmological parameters used in the simulation are 
indm,nb,^A,h) = (0.205, 0.045, 0.75, 0.73). ^ 

The primordial power spectrum used in the simulation 
is the scale-invariant Peebles-Harrison-Zel'dovich spec- 
trum, Hs = 1.0, and the linear r.m.s. density fluctua- 
tion smoothed with a top-hat filter of radius 8 /i~^Mpc 
is (78 = 0.9. Note that these values are significantly 
larger than the latest values found from the WMAP 5- 
year data, as — 0.8 and ~ 0.96 (Dunkley et al. 2008; 
Komatsu et al. 2008), which implies that non-linearities 
in the Millennium Simulation should be stronger than 
those in our Universe. 

The Millennium Simulation was carried out using the 
GADGET code (Springel et al. 2001; Springel 2005). The 
GADGET uses the tree Particle Mesh (tree-PM) gravity 
solver, which tends to have a larger dynamic range than 
the traditional PM solver for the same box size and the 
same number of particles (and meshes) (Heitmann et al. 
2007). Therefore, the matter power spectrum from the 
Millennium Simulation does not suffer from an artifi- 
cial suppression of power as much as those from the PM 
codes. 

The initial particle distribution was generated at 
the initial rcdshift of Zini = 127 using the standard 
Zel'dovich approximation. While the initial conditions 
generated from the standard Zel'dovich approximation 
tend to produce an artificial suppression of power at later 
times, and the higher-order scheme such as the second- 
order Lagrangian perturbation theory usually produces 
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Fig. 1. — Matter power spectrum at ^ = 0, 1, 2, 3, 4, 5 and 
6 {from top to bottom) derived from the Millennium Simulation 
{dashed lines), the 3rd-order PT {solid lines), and the linear PT 
{dot-dashed lines). 
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Fig. 2. — Dimensionless matter power spectrum, A^(fc), at z = 1, 
2, 3, 4, 5, and 6. The dashed and solid lines show the Millennium 
Simulation data and the 3rd-order PT calculation, respectively. 
The dot-dashed lines show the linear power spectrum. 



better results (Scoccimarro 1998; Crocce et al. 2006), the 
initial redshift of the Millennium Simulation, Zini = 127, 
is reasonably high for the resulting power spectra to have 
converged in the weakly non-linear regime. 

The mass of each dark matter particle in the sim- 
ulation is Mdm = 8.6 X IO^A/q/Zi. They require at 
least 20 particles per halo for their halo finder, and 
thus the minimum mass resolution of halos is given by 
Mhaio > ^OMdm ^ 1-7 X 10^° Mq/H. Therefore, the 
Millennium Simulation covers the mass range that is rel- 
evant to real galaxy surveys that would detect galaxies 
with masses in the range of M ~ 10^^ — 10^^ Mq. This 
property distinguishes our study from the previous stud- 
ies on non-linear distortion of BAOs due to galaxy bias 
(e.g., Smith et al. 2007; Huff et al. 2007), whose mass 
resolution was greater than ~ 10^^ Mq. 

In addition to the dark matter halos, the Millennium 
database^ also provides galaxy catalogues from two dif- 
ferent semi-analytic galaxy formation models (De Lucia 
& Blaizot 2007; Croton et al. 2006; Bower et al. 2006; 
Benson et al. 2003; Cole et al. 2000). These catalogues 
give us an excellent opportunity for testing validity of 
the non-linear galaxy power spectrum model based upon 
the 3rd-order PT with the unprecedented precision. 

3.2. Srd-order PT versus Millennium Simulation: 
Dark Matter Power Spectrum 

First, we compare the matter power spectrum from 
the Millennium simulation with the 3rd-order PT cal- 
culation. The matter power spectrum we use here was 
measured directly from the Millennium simulation on the 

fly." 

Figure 1 shows the matter power spectrum from the 
Millennium simulation (dashed lines), the 3rd-order PT 
calculation (solid lines), and the linear PT (dot-dashed 
lines) for seven different redshifts, 2 = 0, 1, 2, 3, 4, 5, 
and 6. The analytical calculation of the 3rd-order PT re- 
produces the non-linear matter power spectrum from the 

® http: / /www. g-vo.org/MyMillennium2/ 

We thank Volker Springel for providing us with the matter 
power spectrum data. 
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Fig. 3. — Fractional difference between the matter power spectra 
from the 3rd-order PT and that from the Millennium Simulation, 
Pm"^{^)/Pm^ — 1 (dots with errorbars). The solid lines show the 
perfect match, while the dashed lines show ±2% accuracy. We 
also show kmax{z), below which we trust the prediction from the 
3rd-order PT, as a vertical dotted line. 

Millennium Simulation accurately at high redshifts, i.e., 
z > 1, up to certain maximum wavenumbers, kmax, that 
will be specified below. To facilitate the comparison bet- 
ter, we show the dimensionless matter power spectrum, 
A^(fc) = fc3p„(fc)/27r2, in Figure 2. 

We find the maximum wavenumber, kmaxiz), below 
which we trust the prediction from the 3rd-order PT, by 
comparing the matter power spectrum from PT and the 
Millennium Simulation. The values of kmax found here 
will be used later when we analyze the halo/galaxy power 
spectra. 

In Paper I we have defined k^ax such that the frac- 
tional difference between PT and the average of ^ 100 
simulations is 1%. Here, we have only one realization, 
and thus the results are subject to statistical fluctua- 
tions that might be peculiar to this particular realiza- 
tion. Therefore, we relax our criteria for k^ax' we define 
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TABLE 1 

Maximum wavenumbers, 

kmax, FOR THE MILLENNIUM 

Simulation 



z 


kmax 

(/i/Mpc) 


(h/Mpc) 


6 


1.5 


1.99 


5 


1.3 


1.37 


4 


1.2 


1.02 


3 


1.0 


0.60 


2 


0.25 


0.35 


1 


0.15 


0.20 



Note. — z: redshift 
kmax '■ the maximum wavenum- 
ber for the simulated Pm(k) to 
agree with the PT calculation 
at 2% accuracy within the sta- 
tistical error of the Millennium 
Simulation 

kmax-- kmax is defined by 
A^(fcmaa;) = 0.4 which is the 
criteria recommended in Paper 
I. 




Wavenumber [h/Mpc] Wovenumber [h/Mpc] 



Fig. 4. — Distortion of BAOs duo to non-linear matter clustering. 
All of the power spectra have been divided by a smooth power 
spectrum without baryonic oscillations from eq. (29) of Eisenstein 
& Hu (1998). The error bars show the simulation data, while the 
solid lines show the PT calculations. The dot-dashed lines show 
the linear theory calculations. The power spectrum data shown 
here have been taken from Figure 6 of Springel et al. (2005). 

kmax such that the fractional difference between PT and 
the Millennium Simulation is 2%. 

Figure 3 shows the fractional differences at z = f , 2, 3, 
4, 5, and 6. Since we have only one realization, we cannot 
compute statistical errors from the standard deviation of 
multiple realizations. Therefore, we derive errors from 
the leading-order 4-point function assuming Gaussianity 
of the underlying density fluctuations (sec Appendix A), 
(^Pik) — PiM) / ^ ^k, where Nk is the number of inde- 
pendent Fourier modes per bin at a given k shown in 
Figure 3. 

We give the values of kmax in Table 1. We shall use 
these values when we fit the halo/galaxy power spectrum 
in the next section. Note that kmax decreases rapidly 
below z — 2. It is because P{k) / PpT{k) — 1 is not a 
monotonic function of k. The dip in P{k)/ PpT{k) — 1 



is larger than 2% at lower redshift, z < 2, while it is 
inside of the 2% range at z > 3. Therefore, our criteria 
of 2% make that sudden change. This feature is due to 
the limitation of the standard 3rd order PT. However, we 
can remove this feature by using the improved perturba- 
tion theory, e.g. using renormalization group techniques. 
(See, Figure 9 of Matarrese & Pietroni (2007).) 

We also give the values of kmax, for which A'^{kmax) = 
0.4 (criteria recommended in Paper I). The difference be- 
tween kmax and kmax IS probably due to the fact that we 
have only one realization of the Millennium Simulation, 
and thus estimation of kmax is noisier. Note that the 
values of kmax given in Table 1 are smaller than those 
given in Paper I. This is simply because erg of the Millen- 
nium Simulation (as = 0.9) is larger than that of Paper 
I ((78 - 0.8). 

In Figure 4 we show the matter power spectra divided a 
smooth spectra without BAOs (Eq. (29) of Eisenstein & 
Hu 1998). The results are consistent with what we have 
found in Paper I: although BAOs in the matter power 
spectrum are distorted heavily by non-linear evolution of 
matter fluctuations, the analytical predictions from the 
3rd-order PT capture the distortions very well at high 
redshifts, z > 2. 

At lower redshifts, z ~ 1, the 3rd-order PT is clearly 
insufficient, and one needs to go beyond the standard PT. 
This is a subject of recent studies (Crocce & Scoccimarro 
2008; Matarrese & Pietroni 2007; Taruya & Hiramatsu 
2008; Valageas 2007; Matsubara 2008; McDonald 2007). 

4. HALO/GALAXY POWER SPECTRUM AND THE 
NON-LINEAR BIAS MODEL 

In this section we compare the 3rd-order PT galaxy 
power spectrum with the power spectra of dark matter 
halos and galaxies estimated from the Millennium Sim- 
ulation. After briefly describing the analysis method in 
§ 4.1, we analyze the halo bias and galaxy bias in § 4.2 
and § 4.3, respectively. We then study the dependence 
of bias parameters on halo/galaxy mass in § 4.4. 

4.1. Analysis method 

We choose six redshifts between 1 < z < 6 from 63 
snapshots of the Millennium Simulation, and use all the 
available catalog of halos (MPA Halo (MHalo) , hereafter 
'halo') and two galaxy catalogues (MPA Galaxies, here- 
after 'Mgalaxy'; Durham Galaxies, hereafter 'Dgalaxy') 
at each redshift. The exact values of redshifts and the 
other relevant information of chosen snapshots are sum- 
marized in Table 2. 

Halos arc the groups of matter particles found directly 
from the Millennium Simulation. First, the dark mat- 
ter groups (called EOF group) are identified by using 
Friends-of-Friends (FoF) algorithm with a linking length 
equal to 0.2 of the mean particle separation. Then, each 
FoF group is divided into the gravitationally bound local 
overdense regions, which we call halos here. 

Mgalaxies and Dgalaxies are the galaxies assigned to 
the halos using two different semi-analytic galaxy forma- 
tion codes: L-Galaxies (Mgalaxies, De Lucia & Blaizot 
2007; Croton et al. 2006) and GALFORM (Dgalaxies, 
Bower et al. 2006; Benson et al. 2003; Cole et al. 2000). 

While both models successfully explain a number of 
observational properties of galaxies like the break shape 
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TABLE 2 

Summary of six snapshots from the Millennium Simulation 



z 


^show 




([Mpc//j]3) 




([Mpc//i]3) 




([Mpc/?i]3) 


5.724 


6 


5,741,720 


21.770 


6,267,471 


19.944 


4,562,368 


27.398 


4.888 


5 


8,599,981 


14.535 


9,724,669 


12.854 


7,604,063 


16.439 


4.179 


4 


11,338,698 


11.024 


13,272,933 


9.418 


10,960,404 


11.405 


3.060 


3 


15,449,221 


8.091 


19,325,842 


6.468 


17,238,935 


7.251 


2.070 


2 


17,930,143 


6.972 


23,885,840 


5.233 


22,962,129 


5.444 


1.078 


1 


18,580,497 


6.727 


26,359,329 


4.742 


27,615,058 


4.527 



Note. — z: the exact redshift of each snapshot 
^show^ the redshift we quote in this paper 

AT^ji the number of MPA halos in each snapshot; l/n/ji the corresponding Poisson shot noise 
NMg' the number of MPA galaxies in each snapshot; I/umq' the corresponding Poisson shot 
noise 

Nbq'- the number of Durham galaxies in each snapshot; 1/nog' the corresponding Poisson shot 

noise 



of the galaxy luminosity function, star formation rate, 
etc, they differ in detailed implementation. For example, 
while the L-Galaxics code uses the halo merger tree con- 
structed by MHalos, the GALFORM code uses different 
criteria for identifying subhalos inside the FOF group, 
and thus uses a different merger tree. Also, two models 
use different gas cooling prescriptions and different initial 
mass functions (IMF) of star formation: L-Galaxics and 
GALFORM define the cooling radius, within which gas 
has a sufficient time to cool, by comparing the cooling 
time with halo dynamical time and the age of the halo, 
respectively. Cold gas turns into stars with two differ- 
ent IMFs: the L-Galaxics code ucscs IMF from Chabricr 
(2003) and the GALFORM code uses Kennicutt (1983). 
In addition to that, they treat AGN (Active Galactic 
Nucleus) feedback differently: the L-Galaxies code intro- 
duces a parametric model of AGN feedback depending on 
the black hole mass and the virial velocity of halo, and 
the GALFORM code imposes the condition that cooling 
flow is quenched when the energy released by radiative 
cooling (cooling luminosity) is less than some fraction 
(which is modeled by a parameter, csmbh) of Eddington 
luminosity of the black hole. For more detailed compar- 
ison of the two model, we refer readers to the original 
papers cited above. 

We compute the halo/galaxy power spectra from the 
Millennium Simulation as follows: 

(1) Use the Cloud-In-Cell (CIC) mass distribution 

scheme to calculate the density field on 1024^ reg- 
ular grid points from each catalog. 

(2) Fourier-transform the discretized density field us- 
ing FFTW^ 

(3) Deconvolve the effect of the CIC pixelization and 
aliasing effect. We divide P(k, z) = |(J(k, z)p at 
each cell by the following window function (Jing 
2005): 



w^(k)=n 



1- 



- sm' I — — 
3 \2kN 



(5) 



where k = (fci,fc2,A:3), and Un = 7r/i? is the 
Nyquist frequency, [H is the physical size of the 

* http://www.fftw.org 



grid).^ 

(4) Compute P{k, z) by taking the angular average of 
CIC-corrected P(k, z) = |(5(k, 2)p within a spher- 
ical shell defined by fc - Afc/2 < |k| < fc + Afc/2. 
Here, Afc = 27r/500 [ft/Mpc] is the fundamental 
frequency that corresponds to the box size of the 
Millennium Simulation. 

From the measured power spectra we find the maxi- 
mum likelihood values of the bias parameters using the 
likelihood function approximated as a Gaussian: 



>C(6i,62,A) 



n 



1 



: exp 



Pi 



2a% 



(6) 

where /Cj's are integer multiples of the fundamental fre- 
quency Afc, Pobs,i is the measured power spectrum at 
k = hi, Pg^i is the theoretical model given by Eq. (2), 
and a Pi is the statistical error in the measured power 
spectrum. 

We estimate api in the same way as in § 3 (see also 
Appendix A). However, the power spectrum of the point- 
like particles like halos and galaxies includes the Poisson 
shot noise, where n is the number density of objects, 
on top of the power spectrum due to clustering. There- 
fore, api must also include the shot-noise contribution. 
We use 



crpi = up{ki 



1 



1 



where 



(7) 
(8) 



is the number of independent Fourier modes used 
for estimating the power spectrum and Pg{ki) is the 
halo/galaxy power spectrum at A; = A;,. Here, AA; = 
27r/(500 Mpc) is the fundamental wavenumber of 
the Millennium Simulation. Note that we subtract the 
Poisson shot noise contribution, Pshot — ^/n, from the 
observed power spectrum before the likelihood analysis. 

^ Note that Eq. (5) is strictly valid for the fiat (white noise) 
power spectrum, P{k) = constant. Nevertheless, it is still accurate 
for our purposes because, on small scales, both the halo and galaxy 
power spectra are dominated by the shot noise, which is also given 
by P{k) = constant. 
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Eq. (7) shows that the error on Pobs{k) depends upon 
the underlying Pg{k). For the aetual data analysis one 
should vary Pg{k) in the numerator of Eq. (6) as well 
as that in a Pi, simultaneously. However, to simplify the 
analysis, we evaluate the likelihood function in an iter- 
ative way: we first find the best-fitting Pg{k) using an 
with Pg{k) in Eq. (7) replaced by Pobs{k)- Let us call 
this Pg{k). We then use Pg{k) in Eq. (7) for finding 
the best-fitting Pg{k) that we shall report in this paper. 
Note that we iterate this procedure only once for current 
study. 

Finally, we compute the 1-d marginalized 1-ct inter- 
val (or the marginalized 68.27% confidence interval) of 
each bias parameter by integrating the likelihood func- 
tion (Eq. (6)), assuming a fiat prior on the bias parame- 
ters (see also Appendix B). 

We first analyze the power spectrum of halos (in § 4.2) 
as well as that of galaxies (in § 4.3) using all the ha- 
los and all the galaxies in the Millennium halo/galaxy 
catalogues. We then study the mass dependence of bias 
parameters in § 4.4. 

In order to show that the non-linear bias model (Eq. 2) 
provides a much better fit than the linear bias model, 
we also fit the measured power spectra with two linear 
bias models: (i) linear bias with the linear matter power 
spectrum, and (ii) linear bias with the non-linear matter 
power spectrum from the 3rd-ordcr FT. When fitting 
with the linear model, we use kmax = 0.15 [/i/Mpc] for 
all redshift bins. 

4.2. Halo power spectra 

4.2.1. Measuring non-linear halo bias parameters 

Figure 5 shows the best-fitting non-linear [solid lines) 
and linear bias models [dashed and dot-dashed lines), 
compared with the halo spectra estimated from the 
Millennium Simulation [points with errorbars). The 
smaller panels show the residuals of fits. The maximum 
wavenumber used in the fits, k^axiz), are also marked 
with the arrows (bigger panels), and the vertical lines 
(smaller panels). We find that the non- linear bias model 
provides substantially better fits than the linear bias 
models. 

We find that all of non- linear bias parameters, hi, h2, 
and Pq, are strongly degenerate, when the maximum 
wavenumbers used in the fits, kmax, are small. In Fig- 
ure 6 we show the one-dimensional marginalized distri- 
bution of bias parameters aX z = 6, as a function of 
kmax- For lower kmax, 0.3 < kmax/[h Mpc"-'] < 1.0, the 
marginalized distribution has two peaks {dashed lines), 
indicating strong degeneracy with the other parame- 
ters. The double-peak structure disappears for 1.0 < 
kmax/[h Mpc""^] < 1.5 [solid lines). 

We find that the origin of degeneracy is simply due to 
the small box size of the Millennium Simulation, i.e., the 
lack of statistics, or too a large sampling variance. To 
show this, we have generated a mock Monte Carlo real- 
ization of halo power spectra, assuming a much bigger 
box size, Lbox = 1-5 h^^ Gpc, which gives the funda- 
mental frequency of Ak = 5.0 x 10~^ h Mpc~^. Note 
that this volume roughly corresponds to that would be 
sm-vcyed by the HETDEX survey (Hill ot al. 2004). We 
have used the same non-linear matter power spectrum 
and the best-fitting bias parameters from the Millennium 



Simulation (MPA halos) when creating Monte Carlo real- 
izations. The resulting marginalized likelihood function 
at z = 6 is shown in Figure 7. The double-peak structure 
has disappeared even for low kmax, kmax = 0.3 h Mpc~^. 
Therefore, we conclude that the double-peak problem 
can be resolved simply by increasing the survey volume. 

The best-fitting non-linear halo bias parameters and 
the corresponding 1-cr intervals are summarized in Table 
3. Since we know that the double-peak structure is spuri- 
ous, we pick one peak that corresponds to the maximum 
likelihood value, and quote the l-cr interval. At z < 2, 
the bias parameters arc not constrined very well l)ec;ause 
of lower kmax and the limited statistics of the Millennium 
Simulation, and hence the two peaks are blended; thus, 
we estimate 1-cr range only from the unblended side of 
the marginalized likelihood function. Two linear bias pa- 
rameters, one with the linear matter power spectrum and 
another with the non-linear PT matter power spectrum, 
are also presented with their 1-cr intervals. 

4.2.2. Degeneracy of bias parameters 

In order to see how strongly degenerate bias parame- 
ters are, we calculate the covariancc matrix of each pair 
of bias parameters. We calculate the covariance matrix 
of each pair of bias parameters by using the Fisher in- 
formation matrix, which is the inverse of the covariance 
matrix. The Fisher information matrix for the galaxy 
power spectrum can be approximated as (Tegmark 1997) 

10 



E 



1 dP[kn,0)dP{k„,e) 



(9) 



where 6* is a vector in the parameter space, 6i — bi, 
62 ,-Po) for i = 1, 2, 3, respectively. We calculate the 
marginalized errors on the bias parameters as following. 
We first calculate the full Fisher matrix and invert it to 
estimate the covariancc matrix. Then, we get the the 
covariance matrices of any pairs of bias parameters by 
taking the 2 by 2 submatrix of the full covariance ma- 
trix. Figure 8 shows the resulting 2-cr (95.45% interval) 
contour for the bias parameters at z = 4. We find the 
strong degeneracy between Pq and 62. We also find that 
61 is degenerate with the other two parameters. On top 
of the error contours for the Millennium Simulation, we 

1" Eq. (9) is equivalent to Eq. (6) in Tegmark (1997). The 
number of k mode in real space power spectrum from a survey of 
volume V is (See Appendix A for notations.) 

^ iTTklSkr^ ^ Vk^Skn 

~ 2(5fc„)3 ~ 47r2 • 
Then, the variance of power spectrum (Eq. (7)) becomes 



(Tp{k„) 



Vk'iSkn 



P(kn) + - 



47r2p(fe„)2 



k'^Skn Veff(fcn)' 



where Veff is the constant density version of Eq. (5) of Tegmark 
(1997). Finally, the elements of Fisher matrix are given by 



-y— 



dP{k„,e) dP{kr„e) 



p(fe„) de^ ae^ 
]_ ^ ap(fc„,(?) dP{kr„e) v,s{k„)klsk„ 



47r2 



d9i 



P(fcn)2 



which is the same as Eq. (6) in Tegmark (1997). 
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Fig. 5. — Halo power spectra from the Millennium Simulation at z = 1, 2, 3, 4, 5, and 6. Also shown in smaller panels are the residual of 
fits. The points with errorbars show the measured halo power spectra, while the solid, dashed, and dot-dashed lines show the best-fitting 
non-linear bias model (Eq. (2)), the best-fitting linear bias with the non-linear matter power spectrum, and the best-fitting linear bias with 
the linear matter power spectrum, respectively. Both linear models have been fit for k^^x linear = O.fS [h Mpc~^], whereas kmax{z) given 
in Table 1 (also marked in each panel) have been used for the non-linear bias model. 



TABLE 3 

Non-linear halo bias parameters and the corresponding 68% interval estimated 

FROM THE MPA halo POWER SPECTRA 



z 


bi 


h 


Po 

([Mpc//i]«) 




b^^ 




rST 
"2 


6 


3.4f±0.01 


1.52±0.03 


141.86±3.73 


3.50±0.03 


3.51±0.03 


3.69 


2.10 


5 


2.76±0.01 


0.9f±0.03 


57.77±2.84 


2.79±0.03 


2.80±0.03 


3.16 


1.70 


4 


2.27±0.0f 


0.52±0.03 


22.65±1.88 


2.28±0.02 


2.29±0.02 


2.77 


1.40 


3 


1.52±0.0f 


-1.94±0.05 


329.42±f0.6 


f.62±0.0f 


1.63±0.01 


2.23 


1.07 


2 


l.f0±0.06 


-2.12±0.65 


507.25±214.7 


f.l9±0.0f 


1.20±0.01 


1.84 


0.76 


1 


0.74±0.09 


-3.05±f.49 


15ff.46±526.7 


0.88±0.01 


0.90±0.0f 


1.54 


0.58 



Note. — z: redshift 
''I I Po- non-linear bias parameters 

b^: linear bias parameter for the linear bias model with the 3rd-order matter power spec- 
trum 

bf^: linear bias parameter for the linear bias model with the linear power spectrum 

non-linear bias parameters calculated from the Sheth-Tormen model. 

Caution: We estimate 1-a ranges for the low redshift (z < Z) only for the peak which 
involves the maximum likelihood value. If two peaks in maginalized likelihood function are 
blended, we use only unblended side of the peak to estimate the 1-a range. 



show the expected contour from the HETDEX hke sur- 
vey (1.5 Gpc/h). Since the volume of HETDEX like sur- 
vey is 27 times bigger, the likelihood functions and the 
error-contours are about a factor of 5 smaller than those 
from the Millennium Simulation. Other than that, two 
contours follow the same trend. Results are the same for 
the other redshifts. 

4.2.3. Comparison with the halo model predictions 



The effective linear bias, 6i, is larger at higher red- 
shifts. This is the expected result, as halos of mass 
greater than lO^^M© were rarer in the earlier time, 
resulting in the larger bias. 

From the same reason, we expect that the non-linear 
bias parameters, 62 and Pq, are also larger at higher 
z. While we observe the expected trend at z > 4, 
the results from 2; < 3 are somewhat peculiar. This 
is probably due to the large sampling variance making 
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Fig. 6. — One-dimensional marginalized distribution of non- 
linear bias parameters at 2 = 6: from top to bottom pan- 
els, Po, te, and bi. Different lines show the different values of 
kmax used for the fits. The dashed and solid lines correspond to 
0.3 < kmax/lh Mpc-i] < 1.0 and 1.0 < kmax/lh Mpc-^] < 1.5, re- 
spectively. The double-peak structure disappears for higher kmax- 




Fig. 7. — Same as Figure 6, but for a Monte Carlo simulation of 
a galaxy survey with a bigger box size, Li,q^ = 1.5 Gpc/h. 

the fits unstable: for 2; < 3 the maximum wavenum- 
bers inferred from the matter power spectra are less than 
1.0 h Mpc~^ (see Table 1), which makes the likelihood 
function double-peaked and leaves the bias parameters 
poorly constrained. 

How do these bias parameters compare with the ex- 
pected values? We use the halo model for computing the 
mass-averaged bias parameters, bf^ and b^^ , assuming 
that the minimum mass is given by the minimum mass 
of the MPA halo catalogue, Mmtn 1-72 x lO^^Mg/Zi: 



j-Mrr 

Jm,„ 



^Mh{M)dM 



(10) 



where dn/dM is the Sheth-Tormen mass function and 
bi{M) is the i-th order bias parameter from Scoccimarro 



Fig. 8. — One-dimensional marginalized constraints and two- 
dimensional joint marginalized constraint of 2-a (95.45% CL) range 
for bias parameters {bi,b2,Po)- Covariance matrices are calculated 
from the Fisher information matrix (Eq. (9)) with the best-fitting 
bias parameters for halo at 2 = 4. 



et al. (2001). 

There is one subtlety. The halo model predicts the co- 
efficients of the Taylor series (Eq. (1)), whereas what we 
have measured are the re-parametrized bias parameters 
given by Eq. (3). However, the formula for bi includes 
the mass variance, cr^, which depends on our choice of a 
smoothing scale that is not well defined. This shows how 
difficult it is to actually compute the halo power spec- 
trum from the halo model. While the measured values 
of 61 and the predicted bf^ compare reasonably well, it 
is clear that we cannot use the predicted bias values for 
doing cosmology. 

For &2, we compute = where 61 is the 

best-fitting value from the Millennium Simulation. This 
would give us a semi apple-to-apple comparison. Never- 
theless, while the agreement is reasonable at z > 4, the 
halo model predictions should not be used for predicting 
62 either. 

4.2.4. Comments on the bispectrum 

While the degeneracy between bias parameters may 
appear to be a serious issue, there is actually a powerful 
way of breaking degeneracy: the bispectrum, the Fourier 
transform of the 3-point correlation function (Matarrese 
et al. 1997). The reduced bispectrum, which is the bis- 
pectrum normalized properly by the power spectrum, 
depends primarily on two bias parameters, bi and 62, 
nearly independent of the cosmological parameters (Se- 
fusatti et al. 2006). Therefore, one can use this property 
to fix the bias parameters, and use the power spectrum 
for determining the cosmological parameters and the re- 
maining bias parameter, Pq. Sefusatti & Komatsu (2007) 
have shown that the planned high-z galaxy surveys would 
be able to determine 61 and 62 with a few percent accu- 
racy. 

We have begun studying the bispectrum of the Mil- 
lennium Simulation. Our preliminary results show that 
we can indeed obtain better constraints on bi and 62 
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0.0 0.1 0.2 0.3 0.4 0.0 0.1 0.2 0.3 0.4 

wovenumber k [h/Mpc] wavenumber k [h/Mpc] 




wavenumber k [h/Mpc] wovenumber k [h/Mpc] 



Fig. 9. — Distortion of BAOs due to non-linear matter clustering and non-linear halo bias. All of the power spectra have been divided by 
a smooth power spectrum without baryonic oscillations from equation (29) of Eisenstein & Hu (1998). The errorbars show the Millennium 
Simulation, while the solid lines show the PT calculations. The dashed lines show the linear bias model with the non-linear matter power 
spectrum, and the dot-dashed lines show the linear bias model with the linear matter power spectrum. Therefore, the difference between 
the solid lines and the dashed lines shows the distortion solely due to non-linear halo bias. 



from the bispectrum than from the power spectrum, pro- 
vided that we use the same k^ax for both the bispectrum 
and power spectrum analyses. Therefore, even when the 
non-hnear bias parameters are poorly constrained by the 
power spectrum alone, or have the double-peak likeli- 
hood function from the power spectrum for lower kmax, 
we can still find tight constraints on bi and 62 from the 
bispectrum. These results will be reported elsewhere. 

4.2.5. Effects on BAOs 

In Figure 9 we show the distortion of BAO features 
due to non-linear matter clustering and non-linear bias. 
To show only the distortions of BAOs at each redshift, 
we have divided the halo power spectra by smooth power 
spectra without baryonic oscillations from equation (29) 
of Eisenstein & Hu (1998) with hi multiphed. Three 
theoretical models are shown: the non-linear bias model 
[solid line), a linear bias model with the 3rd-order matter 
power spectrum [dashed line), and a linear bias model 
with the linear matter power spectrum [dot-dashed line). 
Therefore, the difference between the solid lines and the 
dashed lines is solely due to non-linear halo bias. 

The importance of non-linear bias affecting BAOs 
grows with z; however, as the matter clustering is weaker 
at higher z, the 3rd-order PT still performs better than 
at lower z. In other words, the higher bias at higher z 
does not mean that surveys at higher z are worse at mea- 
suring BAOs; on the contrary, it is still easier to model 
the halo power spectrum at higher z than at lower z. For 
z > 3, where kmax is larger than the BAO scale, the dis- 
tortion of BAOs is modeled very well by the non-linear 



bias model, while the linear bias models fail badly. 

The sampling variance of the Millennium Simulation at 
fc < 0.15 /i Mpc^^ is too large for us to study the distor- 
tion on the first two BAO peaks. Since the PT performs 
well at higher k, we expect that the PT describes the 
first two peaks even better. However, to show this ex- 
plicitly one would need to run a bigger simulation with 
a bigger volume with the same mass resolution as the 
Millennium Simulation, which should be entirely doable 
with the existing computing resources. 

4.3. Galaxy power spectra 
4.3.1. Measuring non-linear galaxy bias parameters 

Figures 10 and 11 show the galaxy power spectra esti- 
mated from the MPA (Mgalaxy) and Durham (Dgalaxy) 
galaxy catalogues, respectively. Here, we basically find 
the same story as we have found for the halo power spec- 
tra (§ 4.2): for k < kmax the non-linear bias model 
fits both galaxy power spectra (Mgalaxy and Dgalaxy), 
whereas the linear bias models fit neither. 

The galaxy bias parameters extracted from Mgalaxy 
and Dgalaxy are summarized in Table 4 and 5, respec- 
tively. While the bias parameters are different for halo, 
Mgalaxy and Dgalaxy, they follow the same trend: (i) hi 
becomes lower as the redshift becomes lower, and (ii) 62 
also becomes lower as the redshift becomes lower when 
z > 3, but suddenly changes to large negative values at 
z < 3. As we have already pointed out in § 4.2, this sud- 
den peculiar change is most likely caused by the double- 
peak nature of the likelihood function, owing to the poor 
statistical power for lower kmax at lower z. In order to 



Non-Linear Bias, BAOs and Millennium Simulation 



11 




Fig. 10. — Same as Figure 5, but for the MPA galaxy catalogue (Mgalaxy). 




Fig. 11.— 



Same as Figure 5, but for the Durham galaxy catalogue (Dgalaxy). 
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TABLE 4 

Non-linear halo bias parameters and the corresponding 68% interval estimated 

FROM the MPA galaxy POWER SPECTRA 



z 


Si 


62 


{[/t/Mpc]:^) 


6f 




6f^ 




6 


3.55±0.01 


1.70±0.03 


194.23±4.45 


3.67±0.03 


3.68±0.03 


3.10 


1.03 


5 


2.93±0.01 


1.08±0.03 


94.08it3.71 


2.97±0.03 


2.98±0.03 


2.55 


0.59 


4 


2.46±0.01 


0.68±0.03 


47.79±2.84 


2.47±0.02 


2.48±0.02 


2.13 


0.28 


3 


1.69±0.01 


-2.12±0.04 


486.69±12.7 


1.83±0.02 


1.83±0.02 


1.58 


-0.12 


2 


1.28±0.08 


-2.16it0.64 


738.22±291.3 


1.40±0.01 


1.40±0.01 


1.19 


-0.34 


1 


0.89±0.11 


-2.97±1.60 


2248.35±786.13 


1.09±0.01 


l.lOitO.Ol 


0.91 


-0.45 



Note. — 2: redshift 
''li S21 P{i'- non-linear bias parameters 

h\ : linear bias parameter for the linear bias model with the 3rd-order matter power spectrum 
b^^: linear bias parameter for the linear bias model with the linear power spectrum 

bf''" , 62^: non-linear bias parameters calculated from the Sheth-Tormen model, 62 '^=62^/61 
Caution: We estimate 1-cr ranges for the low redshift (z <3) only for the peak which involves 
the maximum likelihood value. If two peaks in maginalized likelihood function are blended, 
we use only unblended side of the peak to estimate the 1-a range. 



TABLE 5 

Non-linear halo bias parameters and the corresponding 68% interval estimated 
from the Durham galaxy power spectra 



z 


bi 


62 


([VMpc]3) 


6f 


bi^ 


bf^ 




6 


3.73±0.01 


1.96it0.03 


288.39±5.82 


3.90it0.04 


3.90±0.04 


3.10 


0.98 


5 


3.07±0.01 


1.26±0.03 


143.15±4.81 


3.15±0.03 


3.15±0.03 


2.55 


0.56 


4 


2.57±0.01 


0.83±0.03 


78.97±3.93 


2.60±0.02 


2.61±0.02 


2.13 


0.26 


3 


1.75±0.01 


-2.26±0.04 


604.65±13.8 


1.92±0.02 


1.93±0.02 


1.58 


-0.11 


2 


1.36±0.08 


-2.14±0.65 


843.49±331.4 


1.49±0.01 


1.50±0.01 


1.19 


-0.32 


1 


0.96±0.11 


-2.94±1.62 


2640.20±960.32 


1.18±0.01 


1.20±0.01 


0.91 


-0.42 



Note. — 2: redshift 
&i, 62, Po- non-linear bias parameters 

6^ : linear bias parameter for the linear bias model with the 3rd-order matter power spectrum 
bi^: linear bias parameter for the linear bias model with the linear power spectrum 

non-linear bias parameters calculated from the Shcth-Tormcn model, b2'^ =b2^ /bi 
Caution: We estimate 1-cr ranges for the low redshift (z < 3) only for the peak which involves 
the maximum likelihood value. If two peaks in maginalized likelihood function are blended, 
we use only unblended side of the peak to estimate the 1-a range. 
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Fig. 12. — Same as Figure 9, but for the MPA galaxy power spectrum (Mgalaxy). 
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Fig. 13.— 



Same as Figure 9, but for the Durham galaxy power spectrum (Dgalaxy). 
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study 62 further with better statistics, one needs a bigger 
simulation. 

4.3.2. Comparison with the simplest HOD predictions 

To give a rough theoretical guide for the galaxy bias 
parameters, we assume that each dark matter halo hosts 
one galaxy above a certain minimum mass. This specifies 
the form of the HOD completely: {N\M) = 1, with the 
same lower mass cut-off as the minimum mass of the halo, 
M„i„ = 1.72 X W^^Mq/H. 

This is utterly simplistic, and is probably not correct 
for describing Mgalaxy or Dgalaxy. Nevertheless, we give 
the resulting values in Table 4 and 5, which have been 
computed from 

^ST _ ImZ: mb^(M){N\M)dM 

where dn/dM is the Sheth-Tormen mass function and 

bi{M) is the i-th order bias parameter from Scoccimarro 
et al. (2001). To compare with the non-linear bias pa- 
rameters, we also calculate 62 = fef^/^i- 

While these "predictions" give values that are reason- 
ably close to the ones obtained from the fits, they are 
many a away from the best-fitting values. The freedom 
in the choice of the HOD may be used to make the pre- 
dicted values match the best-fitting values; however, such 
an approach would require at least as many free param- 
eters as the non-linear bias parameters. Also, given that 
the halo bias prediction fails to fit the halo power spectra, 
the HOD approach, which is still based upon knowing the 
halo bias, is bound to fail as well. 

4.3.3. Effects on BAOs 

In Figures 12 and 13 we show how non-linear galaxy 
bias distorts the structure of BAOs. Again, we find the 
same story as we have found for the halo bias: the galaxy 
bias distorts BAOs more at higher z because, for a given 
mass, galaxies were rarer at higher rcdshifts and thus 
more highly biased, while the quality of the fits is better 
at higher z because of less non-linearity in the matter 
clustering. 

In all cases (halo, Mgalaxy and Dgalaxy) the non-linear 
bias model given by Eq. (2) provides very good fits, and 
describes how bias modifies BAOs. 

4.4. Mass dependence of bias parameters and effects 
on BAOs 

So far, we have used all the available halos and galaxies 
in the Millennium catalogues for computing the halo and 

galaxy power spectra. In this section we divide the sam- 
ples into different mass bins given by Af < 5 x IQ^^Mq j h, 
5 X 10^°MQ/h < M < 10"Mq//i, 10"Mq//i < M < 
5 X lO"M0//i, 5 X lO"M0//i < M < lO^^Mo/Zi, and 
study how the derived bias parameters depend on mass. 

The power spectra of the selected halos and galaxies 
in a given mass bin are calculated and fit in the exactly 
same manner as before. Note that we shall use only the 
halo and Mgalaxy, as we expect that Dgalaxy would give 
similar results to Mgalaxy. 

Figures 14 and 15 show the results for the halo and 
galaxies, respectively. To compare the power spectra of 
different mass bins in the same panel, and highlight the 



effects on BAOs at the same time, we have divided the 
power spectra by a non-oscillating matter power spec- 
trum from equation (29) of Eisenstein & Hu (1998) with 

the best-fitting from each mass bin multiplied. These 

figures show the expected results: the larger the mass 
is, the larger the non-linear bias becomes. Nevertheless, 
the 3rd-order PT calculation captures the dependence on 
mass well, and there is no evidence for failure of the PT 
for highly biased objects. 

In Tables 6 and 7 we give values of the measured bias 
parameters as well as the "predicted" values. For all 
redshifts we see the expected trend again: the higher the 
mass is, the larger the effective linear bias (bi) is. The 
same is true for 62 for z > 3, while it is not as apparent 
for lower redshifts, and eventually becomes almost fuzzy 
for 2; = 1. Again, these are probably due to the lack of 
statistics due to lower values of kmax at lower z, and we 
need a bigger simulation to handle these cases with more 
statistics. 

The high values of bias do not mean failure of PT. 
The PT galaxy power spectrum model fails only when 
A^(fc, z) exceeds ~ 0.4 (Paper I), or the locality of bias is 
violated. Overall, we find that the non-linear bias model 
given by Eq. (2) performs well for halos and galaxies with 
all mass bins, provided that we use the data only up to 
kmax determined from the matter power spectra. This 
implies that the locality assumption is a good approx- 
imation for k < kmax] however, is it good enough for 
us to extract cosmology from the observed galaxy power 
spectra? 

5. COSMOLOGICAL PARAMETER ESTIMATION 
WITH THE NON-LINEAR BIAS MODEL 

In the previous sections we have shown that the 3rd- 
order PT galaxy power spectrum given by Eq. (2) pro- 
vides good fits to the galaxy power spectrum data from 
the Millennium Simulation. 

However, we must not forget that Eq. (2) contains 3 
free parameters, 61, 62, and Pq. With 3 parameters it 
may seem that it should not be so difficult to fit smooth 
curves like those shown in, e.g.. Figure 10. 

While the quality of fits is important, it is not the end 
of story. We must also show that Eq. (2) can be used for 
extracting the correct cosmological parameters from the 
observed galaxy power spectra. 

In this section we shall extract the distance scale from 
the galaxy power spectra of the Millennium Simulation, 
and compare them with the input values that were used 
to generate the simulation. If they do not agree, Eq. (2) 
must be discarded. If they do, we should proceed to the 
next level by including non-linear redshift space distor- 
tion. 

5.1. Measuring Distance Scale 
5.1.1. Background 

Dark energy influences the expansion rate of the uni- 
verse as well as the growth of structure (see Copeland 
et al. 2006, for a recent review). 

The cosmological distances, such as the luminosity dis- 
tance, Di^{z), and angular diameter distance, Da{z), are 
powerful tools for measuring the expansion rates of the 
universe, H{z), over a wide range of redshifts. Indeed, 
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Fig. 14. — Mass dependence of distortion of BAOs due to non-linear bias. Four mass bins, A/ < 5 X W^'-'Mq/H, 5 X lO^'^ Mq/H < M < 
10"MQ/h, IO^'^Mq/H < M < 5 X lOi^Me/h, and 5 X lO^^MQ/h < M < IO^^Mq//!, are shown. (Mio stands for M / {10^'^ Mq) .) All 
of the power spectra have been divided by a smooth power spectrum without baryonic oscillations from equation (29) of Eisenstein & Hu 
(1998). The errorbars show the Millennium Simulation data, while the solid lines show the FT calculation. 
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Fig. 15.— 



Same as Figure 14, but for the MPA galaxy catalogue (Mgalaxy). 
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TABLE 6 

Mash dependenc^e of non-linear halo bl\s parameters (MPA oalok) 



z 






6i 


62 




6f^ 






(Mo/fe) 


(Mo/ft) 






([VMpc]3) 






6 


1.7E+10 


5.0E+10 


3.19+0.01 


1.28+0.03 


88.76+2.97 


2.96 


0.93 




5.0E+10 


l.OE+11 


3.90+0.02 


1.91d=0.04 


288.18+8.02 


3.52 


1.36 




l.OE+11 


5.0E+11 


4.66+0.03 


3.04+0.05 


1029.19+18.84 


4.41 


2.28 




5.0E+11 


l.OE+12 


6.41+0.14 


5.76+0.21 


6910.17+200.74 


5.95 


3.59 


5 


1.7E+10 


5.0E+10 


2.55+0.01 


0.71+0.03 


31.51+2.06 


2.41 


0.48 




5.0E+10 


l.OE+11 


3.09+0.01 


1.19+0.04 


120.84+5.69 


2.84 


0.81 




l.OE+11 


5.0E+11 


3.78+0.02 


1.79+0.04 


402.11+12.40 


3.55 


1.48 




5.0E+11 


l.OE+12 


5.14+0.07 


3.55+0.11 


2805.48+94.53 


4.71 


2.44 


4 


1.7E+10 


5.0E+10 


2.08+0.01 


0.38+0.04 


10.90+1.19 


2.01 


0.15 




5.0E+10 


l.OE+11 


2.51+0.01 


0.66+0.04 


42.34+3.52 


2.33 


0.40 




l.OE+11 


5.0E+11 


3.05+0.01 


1.08+0.04 


161.22+8.11 


2.90 


0.92 




5.0E+11 


l.OE+12 


3.80+0.05 


-4.08+0.09 


3431.19+64.81 


3.79 


1.77 


3 


1.7E+10 


5.0E+10 


1.39+0.01 


-1.83+0.05 


241.59+9.58 


1.47 


-0.25 




5.0E+10 


l.OE+11 


1.75+0.01 


0.11+0.05 


2.48+0.29 


1.67 


-0.10 




l.OE+11 


5.0E+11 


2.09+0.01 


0.35+0.04 


20.95+3.22 


2.04 


0.19 




5.0E+11 


l.OE+12 


2.78+0.02 


0.82+0.06 


171.31+21.03 


2.57 


0.60 


2 


1.7E+10 


5.0E+10 


1.01+0.05 


-1.98+0.68 


373.60+149.24 


1.11 


-0.46 




5.0E+10 


l.OE+11 


1.14+0.07 


-2.30+0.63 


627.69+204.69 


1.23 


-0.40 




l.OE+11 


5.0E+11 


1.31+0.08 


-2.34+0.63 


869.30+272.06 


1.44 


-0.28 




5.0E+11 


l.OE+12 


1.62+0.11 


-2.53+0.69 


1566.40+476.07 


1.75 


-0.05 


1 


1.7E+10 


5.0E+10 


0.68+0.09 


-3.12+1.46 


1315.40+447.14 


0.86 


-0.58 




5.0E+10 


l.OE+11 


0.75+0.10 


-3.24+1.46 


1699.76+571.75 


0.92 


-0.56 




l.OE+11 


5.0E+11 


0.85+0.09 


-2.80+1.65 


1783.07+683.57 


1.02 


-0.53 




5.0E+11 


l.OE+12 


0.99+0.11 


-2.82+1.95 


2443.76+972.16 


1.17 


-0.47 



Note. — z: redshift 
Mmin: minimum mass for a given bin 
Mmax: maximum mass for a given bin 



61, 62, Pq- non- linear bias parameters 

bf^- '^^^^ parameters from the Sheth-Tormen model, 62^=62-^/61 
Caution: We estimate 1-a ranges for the low redshift (z < 3) only for the peak which 
involves the maximum likelihood value. If two peaks in maginalized likelihood function are 
blended, we use only unblended side of the peak to estimate the 1-a range. 
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TABLE 7 

Mass dependence of non-linear galaxy bias parameters (MPA galaxies) 



61 



62 



(Mq/Zi) (Mq/H) 



Po 

([h/Mpcf) 



1ST 
"2 



1.7E+10 
5.0E-I-10 
l.OE+11 
5.0EH-11 



5.0E+10 
l.OE+11 
5.0E+11 
l.OE+12 



3.37±0.01 
3.96±0.02 
4.69±0.03 
6.43±0.14 



1.50±0.03 
2.00±0.04 
3.09±0.05 
5.79±0.20 



136.39±3.69 
325.38±8.44 
1078.72±19.25 
7046.28±201.94 



2.91 
3.49 
4.23 
5.89 



0.82 
1.31 
2.01 
3.49 



1.7E+10 
5.0E+10 
l.OE+11 
5.0E+11 



5.0E+10 
l.OE+11 
5.0E+11 
l.OE+12 



2.77+0.01 
3.16+0.01 
3.81+0.02 
5.15+0.07 



0.93+0.03 
1.27+0.04 
1.84+0.04 
3.60+0.11 



63.17+2.99 
144.20+6.16 
432.51+12.80 
2897.95+95.17 



2.38 
2.82 
3.41 
4.67 



0.40 
0.77 
1.28 
2.37 



1.7E+10 
5.0E+10 
l.OE+11 
5.0E+11 
1.7E+10 
5.0E+10 
l.OE+11 
5.0E+11 



5.0E+10 
l.OE+11 

5.0E+11 
l.OE+12 
5.0E+10 
l.OE+11 
5.0E+11 
l.OE+12 



2.33+0.01 
2.59+0.01 
3.09+0.02 
3.83+0.05 
1.62+0.01 
1.84+0.01 
2.14+0.01 
2.80+0.02 



0.58+0.03 
0.74+0.04 
1.13+0.04 
-4.09+0.09 
-2.07+0.05 
0.19+0.04 
0.38+0.04 
0.84+0.06 



32.25+2.25 
56.91+4.08 

179.81+8.52 
3507.05+64.85 

431.79+12.04 
7.04+1.04 
27.64+3.67 

191.24+21.65 



1.98 
2.32 
2.79 

3.76 



0.11 
0.37 
0.77 

1.71 



1.45 
1.66 
1.96 
2.55 



-0.22 
-0.10 
0.12 
0.57 



1.7E+10 
5.0E+10 
l.OE+11 
5.0E+11 



5.0E+10 
l.OE+11 
5.0E+11 
l.OE+12 



1.26+0.07 
1.21+0.08 
1.35+0.09 
1.65+0.11 



-2.09+0.66 
-2.35+0.62 
-2.32+0.63 
-2.50+0.69 



683.11+240.40 
738.09+231.05 
919.65+288.79 
1602.17+480.23 



1.10 
1.22 
1.40 
1.74 



-0.37 
-0.38 
-0.29 
-0.06 



1.7E+10 
5.0E+10 
l.OE+11 
5.0E+11 



5.0E+10 
l.OE+11 
5.0E+11 
l.OE+12 



0.91+0.11 
0.79+0.11 
0.87+0.10 
1.01+0.11 



-2.96+1.59 
-3.28+1.51 
-2.88+1.63 
-2.80+2.01 



2344.13+802.55 
1956.42+657.89 
1964.64+738.71 
2550.21+1007.23 



0.86 
0.92 
1.00 
1.16 



-0.43 
-0.53 
-0.51 
-0.46 



Note. — z: rcdshift 
Mmin: minimum mass for a given bin 
Mmax: maximum mass for a given bin 
61, 621 Po- non-linear bias parameters 

bf^, bf^- ^i^^ parameters from the Sheth-Tormen model, 6^^=fe^^/fei 

Caution: We estimate 1-cr ranges for the low redshift (z < 3) only for the peak which 
involves the maximum likelihood value. If two peaks in maginalized likelihood function are 
blended, we use only unblended side of the peak to estimate the 1-a range. 



it was Dl{z) measured out to high- 2: {z < 1.7) Type la 
supcrnovac that gave rise to the first compelling evidence 
for the existence of dark energy (Riess et al. 1998; Perl- 
mutter et al. 1999). The CMB power spectrum provides 
us with a high- precision measurement of £'^(z*) out to 
the photon decoupling epoch, 0* ~ 1090 (see Komatsu 
et al. 2008, for the latest determination from the WMAP 
5-ycar data). 

The galaxy power spectrum can be used for measuring 
Da{z) as well as II{z) over a wider range of redshifts. 
Prom galaxy surveys we find three-dimensional positions 

of galaxies by measuring their angular positions on the 
sky as well as their redshifts. We can then estimate the 
two-point correlation function of galaxies as a function of 
the angular separation, A^, and the redshift separation, 
A2:. To convert and A2; into the comoving separations 
perpendicular to the line of sight, Arx, and those along 
the line of sight, Ary, one needs to know Da{z) and 
II{z), respectively, as 



Ar+ = (1 + z)Da{z)M, 
cAz 



Am 



H{zy 



(12) 
(13) 



where {1 + z) appears because Da{z) is the proper (phys- 
ical) angular diameter distance, whereas Ar±_ is the co- 
moving separation. Therefore, if we know Ar± and Ar|| 
a priori, then we may use the above equations to measure 
Da{z) and H{z). 

The galaxy power spectra contain at least three dis- 
tance scales which may be used in the place of Ar± and 



Ar||: (i) the sound horizon size at the so-called baryon 

drag epoch, z^rag — 1020, at which baryons were released 
from the baryon-photon plasma, (ii) the photon horizon 
size at the matter-radiation equality, z^q ~ 3200, and 
(iii) the Silk damping scale (see, e.g., Eisenstein & Hu 
1998). 

In Fourier space, we may write the observed power 
spectrum as (Seo & Eisenstein 2003) 

Po,sih,k.,z)-' ^V^true(.)- 



DA,tTue{z) 



xPt, 



-Da, true (2) . 



H{z) 



H{z) 



■k\\ 



(14) 



Da{z) Htrue{z) 

where kx and fcy are the wavenumbers perpendicular 
to and parallel to the line of sight, respectively, and 

Pirueik). DA,true{z), and Htrue(z) are the true, under- 
lying values. We then vary Da{z) and H{z), trying to 
estimate DA,true{z) and Htrue{z). 

There are two ways of measuring Da{z) and H{z) from 
the galaxy power spectra: 

(1) Use BAOs. The BAOs contain the information of 

one of the standard rulers, the sound horizon size 
at Zdrag- This mcthod relies on measuring only 
the phases of BAOs. which are markedly insensi- 
tive to all the non-linear effects (clustering, bias, 
and redshift space distortion) (Seo & Eisenstein 
2005; Eisenstein et al. 2007; Nishimichi et al. 2007; 
Smith et al. 2008; Angulo et al. 2008; Sanchez et al. 
2008; Seo et al. 2008; Shoji et al. 2008), despite 
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the fact that the amphtude is distorted by non- 
Unearities (see Figures 4, 9, 12, and 13). Therefore, 
BAOs provide a robust means to measure Da{z) 
and H{z), and they have been used for determin- 
ing D\H-^ out to z = 0.2 from the SDSS main 
galaxy sample and 2dFGRS, as well as to 2; = 0.35 
from the SDSS Luminous Red Galaxy (LRG) sam- 
ple (Eisenstein et al. 2005; Percivai et al. 2007); 
however, since they use only one standard ruler, the 
constraints on Da{z) and H{z) from the B AO-only 
analysis are weaker than the full analysis (Shoji 
et al. 2008). 

(2) Use the entire shpae of the power spectrum. This 
approach gives the best determination (i.e., the 
smallest error) of Da{z) and H{z), as it uses all 
the standard rulers encoded in the galaxy power 
spectrum; however, one must understand the dis- 
tortions of the shape of the power spectrum due 
to non-linear effects. The question is, "is the 3rd- 
order (or higher) PT good enough for correcting 
the key non-linear effects?" 

In this paper we show, for the first time, that we can 
extract the distance scale using the 3rd-order PT galaxy 
power spectrum in real space. While we have not yet 
included the effects of redshift space distortion, this is 
a significant step towards extracting Da{z) and H{z) 
from the entire shape of the power spectrum of galaxies. 
We shall address the effect of non-linear redshift space 
distortion in the future work. 

5.1.2. Method: Measuring "Box Size" of the 
Millennium Simulation 

In real space simulations (as opposed to redshift space 
ones), there is only one distance scale in the problem: 

the box size of the simulation, ibox, which is = 
500 Mpc//i for the Millennium simulation. Then, "esti- 
mating the distance scale from the Millennium Simula- 
tion" becomes (equivalent to "estimating Lbox from the 
Millennium Simulation. Eq. (14) now leads: 



Pobs{k, ibox) 



- (true) 



T- (true) 
box j 

Lhox 



(15) 



As we estimate the variance of power spectrum from 
the observed power spectrum, we need to rescale the vari- 
ance when the normalization of the observed power spec- 
trum changes : 



O'Pi(-^box) = 



^2^1 4,(4*™'')^ 

''box / 



- (true) 



(16) 



We estimate Lbox using the likelihood function given 



by 



£(&l,&2,-Po ,-^-box) = Yl 



1 



X exp 



fc*<fema» V27ra-^.(I,box) 



2aUki/a) 



The likelihood function, Eq. (17), still depends upon 
the bias parameters that we wish to eliminate. There- 
fore we marginalize the likelihood function over all the 
bias parameters with flat priors. We obtain (see also 
Appendix B): 



where a = Lbox/ibox''^- 



pOO poo pOO 

-C(ibox)= / dbl db2 dPo Cih,b2,Po,Lbo 

Jo J — 00 J — 00 

(18) 

Hereafter, we shall simply call Lbox as D for 'distance 
scale'. D is closely related to the angular diameter dis- 
tance, Da{z), and the expansion rate, H{z), in real sur- 
veys. (See, §5.1.1) 

5.1.3. Results: Unbiased Extraction of the distance 
scale from the Millennium Simulation 

In Figure 16 we show D{z) / Dtrue{z) estimated from 
the halo, Mgalaxy, and Dgalaxy catalogues at 2: = 1, 2, 
3, 4, 5, and 6. The maximum likelihood values (filled 
circles) and the corresponding l-cr intervals (errorbars), 
as well as the mean of the likelihood (stars) are shown. 
We find D{z) / Dtrue{z) = 1 to within the 1-a errors from 
all of the halo/galaxy catalogues at all redshifts, provided 
that we use Pobs{k) only up to k„iax that has been deter- 
mined unambiguously from the matter power spectrum 
(see Table 1). Not only does this provide a strong sup- 
port for the validity of Eq. (2), but also it provides a 
practical means for extracting D from the full shape of 
the observed galaxy power spectra. 

Despite a small volume of the Millennium Simulation 
and the use of flat priors on the bias parameters upon 
marginalization, we could determine D to about 2.5% 
accuracy. 

In addition, we also find that the error on D hardly de- 
creases even though kmax increases. It is because of the 
degeneracy between D and the bias parameters. In order 
to see how strongly degenerate they are, we calculate cor- 
relations between pairs of parameters {bi,b2,Po,D / Dtme) 
by the Fisher information matrix from Eq. (9). 

Figure 17 shows both one-dimensional marginalized 
constraints and two-dimensional joint marginalized con- 
straints of 2-a range (95.45% CL) for the bias parameters 
and the distance scale. This figure indicates that when 
we include the distance scale, the correlations between 
bias parameters become milder. It is mainly due to the 
correlation between the distance scale and b\ making the 
constraint on hi weaker. On the other hand, the one- 
dimensional marginalized likelihood functions for 62 and 
Po are hardly changed. The remaining degeneracies are 
those between (62, -Po) and (bi,D/ Dtme)- These degen- 
eracies would be broken when we include the information 
from the bispectrum, as the bispectrum will measure bi 
and &2- 

5.1.4. Optimal estimation of the distance scale 

The constraint we find from the previous subsection 
will get better when we include the bispectrum, as the re- 
duced bispectrum provides independent and strong con- 
straints on 61 and 62 (Sefusatti et al. 2006). 

Note that this is the most conservative analysis one can do. In 
reaUty we can use the bispectrum for measuring 61 and 621 which 
would give appropriate priors on them (see § 4.2.4). We shall report 
on the results from this analysis elsewhere. 



Non-Linear Bias, BAOs and Millennium Simulation 



19 



vi:^ 0.95 

Q 

0.90 

■S 1 .05 

1 .00 

3 0.95 
a 

0.90 
S 1.05 

1 .00 
3 0.95 

0.90 













z = 


tl 

6 


hm 


nr 


Dqoloxy 


H 












tl 


rtn 


nr 


t++t1 

Mqalaxy 


H 












ti 


htti 


nr 


halo 



0.90 ; 
'■^ 1 -05 - 



^ 0.95 : 
0.90 ^ 



Dgoloxy 



Q" 1 .00 
3 0.95 
0.90 

3, 1 .05 
Q" 1 .00 
3 0.95 

i^qoioxy ] ° 0.90 



0.95 - 
0.90 L 



0.2 0.4 0.6 0.8 1.0 1.2 1.4 



0.2 0.4 0.6 0.8 1.0 1.2 



. 0.95 
0.90 



Dgoloxy 



Mqoloxy 



0.2 0.4 0.6 O.a 1.0 1.2 1.4 



^ 0.95 

Q 

0.90 

3^1.05 

1 .00 

3 0.95 
o 

0.90 

3_ 1 .05 

1 .00 

3 0.95 
a 

0.90 







z = 3 


Dqcloxy 




1 1 M f 




Mqoloxy 




H t H 




holo 



0.95 
0.90 



I 0.95 
0.90 



I 0.95 
0.90 





www 


z = 2 


Dgolaxy 






Hill 




Mqoloxy 


\\\\\\\\\ 


WWl 




holo 



-,,.05, 
Q 1.00 f- 
3 0.95 - 
0.90 ^_ 

3 1.05 J 
0-1.00^ 



. 0.95 
0.90 



. 0.95 
0.90 



Tmrn 



Uqaloxy 



Mqoloxy 



0.2 0.4 0.6 0.8 ,.0 ,.2 ,.4 
k [h/Nflpc] 



0.2 0.4 0.6 0.8 ,.0 1.2 ,.4 
k [h/Upc] 



0.2 0.4 0.6 0.8 ,.0 ,.2 ,.4 



Fig. 16. — Distance scale extracted from the Millennium Simulation using the 3rd-order PT galaxy power spectrum given by Eq. (2), 
divided by the true value. The mean of the likelihood {stars), and the maximum likelihood values {filled circles) and the corresponding 
l-(7 intervals {errorbars), are shown as a function of maximum wavenumbers used in the fits, kmax- We find D/Dtrue = 1 to within the 
1-(T errors from all the halo/galaxy catalogues ("halo," "Mgalaxy," and "Dgalaxy") at all redshifts, provided that we use kmax estimated 
from the matter power spectra, kmax = 0.15, 0.25, 1.0, 1.2, 1.3, and 1.5 at 2: = 1, 2, 3, 4, 5, and 6, respectively (see Table 1). Note that 
the errors on D do not decrease as kmax increases due to degeneracy between D and the bias parameters. See Figure (18) and (19) for 
further analysis. 




Fig. 17. — Same as Figure 8, but including the distance scale D/Dtrue- 
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Fig. 18. — Same as Figure 16, but with bi and 62 fixed at the best-fitting values. The 1-cr ranges for D are 1.5% and 0.15% for 
kmax = 0.2 /i/Mpc and kmax = 1-5 /i/Mpc, respectively. The errors on D decrease as kmax increases, but the scaling is still milder than 
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Fig. 19. — Same as Figure 16, but with bi, 62 and Pq fixed at the best-fitting values. The l-cr ranges for D are 0.8% and 0.05% for 
kmax = 0.2 h/Mpc and kmax = 1-5 /i/Mpc, respectively. The errors on D decrease as kmax increases as \/'^k<k ^.u: 
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How much will it be better? First, let us assume that 
we know the exact values of bi and 62- In this case, we 
get the error on D by marginalizing only over Pq while 
setting bi and 62 to be the best-fitting values, i.e. 

/oo 
dPoC{bY,b^^,Po,D) (19) 
-00 

where b^ and b^^ denote the best-fitting values of bi and 
62 for each kmax, respectively. In Figure 18, wc show 
D/Dtrue estimated from Eq. (19). This figure shows that 
we can extract D to about 1.5% accuracy even for the 
low k„iax = 0.2 h/Mpc, and the error decreases further to 
Q.15%ioikmax = 1.5/i/Mpc. Note that the uncertainties 
on D/Dtrue decrease as kmax increases as expected. The 
reason is because fixing 61 and 62 breaks the degeneracy 
between them and the distance scale. 

In reality, the bias parameters estimated from the bis- 
pectrum have finite errors, and thus the accuracy of ex- 
tracting D will be somewhere in between Figure 16 and 
Figure 18. The result of the full analysis including both 
power spectrum and bispectrum of Millennium Simula- 
tion will be reported elsewhere. 

In the ideal situation where we completely understand 
the complicated halo/galaxy formation, we may be able 
to calculate the three bias parameters from the first prin- 
ciple. This ideal determination of bias parameters will 
provide more accurate constraints on the distance scale 
D. In this case, we get the likelihood function by fixing 
all the bias parameters to their best-fitting values : 

^fix biasp) ^ ^^^bf ^ ^bf ^ pbf ^ jj^ (20) 

By knowing all the bias parameters, we can extract 
the distance scale D to 0.8% accuracy for kmax = 

0. 2 h/Mpc. The error decreases further to 0.05% for 
kmax = 1.5 /i/Mpc. (See Figure (19)) 

5.1.5. Forecast for a HETDEX-like survey 

The planned future surveys would cover a larger vol- 
ume than the Millennium Simulation. Also, since the 
real surveys would be limited by their continuum/flux 
sensitivity, they would not be able to detect all galax- 
ies that were resolved in the Millennium Simulation. In 
this subsection we explore how the constraints would be 
affected by the volume and the number of objects. 

To simulate the mock data, we take a simplified ap- 
proach: we take our best-fitting power spectrum at z = 3, 

1. e., Eq. (2) fit to the power spectrum of MPA halos in 
the Millennium Simulation at z = 3, and add random 
Gaussian noise to it with the standard deviation given 
by Eq. (7). To compute the standard deviation we need 
to specify the survey volume, which determines the fun- 

1 /3 

damental wavenumber, Ak, as Afc — I-k jVsurvey- We 
use the volume that would be surveyed by the HET- 
DEX survey (HiU et al. 2004), Vsuwey = (1-5 Gpc//i)3, 
which is 27 times as large as the volume of the Millen- 
nium Simulation. We then vary the number of galax- 
ies, N galaxy, which determines the shot noise as Pshot = 
1/n = Vsurvey/N galaxy We havc generated only one re- 
alization, and repeated the same analysis as before to 
extract Da from the mock HETDEX data. 

In Figure 20 we show D/Dtme as a function of kmax 
and Ng. For Ngaiaxy = 10^, which gives the same number 
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Fig. 20. — Projected constraints on D at 2 = 3 from a HETDEX- 
like survey with the survey volume of (1.5 Gpc/h)^. We have used 
the best-fitting 3rd-order PT power spectrum of MPA halos in the 
Millennium Simulation for generating a mock simulation data. We 
show the results for the number of objects of N galaxy = 2 X 10^, 
10®, 2 X 10®, and 10^, from the top to bottom panels, respectively, 
for which we find the projected 1-cr errors of 2.5%, 1.5%, 1%, and 
0.3%, respectively. 

density as the Millennium Simulation, the projected error 
on D is 0.3%, or 8 times better than the original result 
presented in Figure 16. Since the volume is 27 times 
bigger, the statistics alone would reduce the error by a 
factor of about 5. 

The other factor of about 1.5 comes from the fact that 
the variance of the distance scale estimated from the 
Millennium Simulation lies on the tail of the distribu- 
tion of the variance of the distance scale, (See, appendix 
C) while the error estimated from the HETDEX volume 
mock is close to the peak of PDF of the variance. 

However, real surveys will not get as high the number 
density as the Millennium Simulation. For example, the 
HETDEX survey will detect about one million Lya emit- 
ting galaxies, i.e., Ngaiaxy — 10®. In Figure 20 we show 
that the errors on D increase from 0.3% for Ngaiaxy = 10® 
to 1%, 1.5%, and 3% for Ngaiaxy = 2 X 10^ 10^ and 
2 X 10^, respectively. 

Finally, we note that these forecasts are not yet final, 
as we have not included the effect of non-linear rcdshift 
space distortion. Also, eventually one needs to repeat 
this analysis using the "super Millennium Simulation" 
with a bigger volume. 

6. DISCUSSION AND CONCLUSIONS 

Two main new results that we have presented in this 
paper are: 

• The 3rd-order PT galaxy power spectrum given by 
Eq. (2), which is based upon the assumption that 
the number density of galaxies at a given location 
is a local function of the underlying matter density 
at the same location (Fry & Gaztanaga 1993) plus 
stochastic noise (McDonald 2006), fits the halo as 
well as galaxy power spectra estimated from the 
Millennium Simulation at high redshifts, 1 < z < 6, 
up to the maximum wavenumber, kmax, that has 
been determined from the matter power spectrum. 
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• When 3 galaxy bias parameters, 61, 62, and Pq, are 
marginalized over, the 3rd-order PT galaxy power 
spectrum fit to the Millennium Simulation yields 
the correct (unbiased) distance scale to within the 
statistical error of the simulation, ~ 3%. 

These results suggest that the 3rd-order PT provides us 
with a practical means to extract the cosmological infor- 
mation from the observed galaxy power spectra at high 
redshifts, i.e., z > 1, accurately. 

We would like to emphasize that our approach does 
not require simulations to calibrate the model. The 3rd- 
order PT is based upon the solid physical framework, 
and the only assumption made for the galaxy formation 
is that it is a local process, at least on the scales where the 
3rd-order PT is valid, i.e., k < kmax- The only serious 
drawback so far is that the 3rd-order PT breaks down 
at low redshifts, and thus it cannot be applied to the 
current generation of survey data such as 2dFGRS and 
SDSS. However, the planned future high-z surveys would 
benefit immensely from the PT approach. 

The practical application of our approach may proceed 
as follows: 

(1) Measure the galaxy power spectra at various red- 
shifts. When we have N redshift bins, the number 
of bias parameters is 3iV, as the bias parameters 
evolve with z. 



(2) Calculate fc„ 



{z) from the condition, 
0.4, where A^(fc,z) = 
k^Pss{k^z)/{2'K'^) is computed from the fidu- 
cial cosmology, e.g., the WMAP 5-ycar best-fitting 
parameters. The results should not be sensitive to 
the exact values of kmax{z)- 

(3) Fit Eq. (2) to the observed galaxy spectra up to 

kmax{z) at all z simultaneously for extracting the 
cosmological parameters. 

In addition to this, we should be able to improve upon 
the accuracy of parameter determinations by including 
the bispectrum as well, as the bispectrum basically fixes 
61 and 62 (Sefusatti & Komatsu 2007). Therefore, the 
step (3) may be replaced by 



(3') Fit Eq. (2) to the observed galaxy spectra up to 
kmax{z), and fit the PT bispectrum to the observed 
galaxy bispectra up to the same kmax{z), at all z 
simultaneously for extracting the cosmological pa- 
rameters. 

We are currently performing a joint analysis of the galaxy 
power spectra and bispectra on the Millennium Simula- 
tion. The results will be reported elsewhere. 

There are limitations in our present study, however. 
First, a relatively small volume of the Millennium Simu- 
lation docs not allow us to make a precision test of the 
3rd-order PT. Also, this limitation does not allow us to 
study constraints on more than one cosmological param- 
eter. We have picked D as the representative example 
because this parameter seems the most interesting one 
in light of the future surveys whose primary goal is to 
constrain the properties of dark energy. In the future we 
must use larger simulations to show convincingly that 
the bias in cosmological parameters is much lower than 
1% level. Second, we have found that, due to the limited 
statistics of a small volume and the smaller k„iax due to 
stronger non-linearities, the bias parameters are not de- 
termined very well from the galaxy power spectra alone 
&t z <2>. This issue should disappear by including the 
bispectrum in the joint analysis. Last and foremost, our 
study has been restricted to the real space power spec- 
tra: we have not addressed the non-linearities in redshift 
space distortion. This is a subject of the future study. 
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APPENDIX 
ERROR ON POWER SPECTRUM 
Besides the normalization, an estimator for the power spectrum may be written as 



Po,s{k) = —Y,\S{\^^)\ 



(Al) 



|ki-fe|<Afe 



where (5(ki) is a Fourier transform of the density field in position space, Afc is the fundamental wavenumber of either 
survey volume or simulation box, and Nk is the number of independent /c-modes available per bin. This estimator is 
unbiased because 

(n6«(fc)) = — 5^(|5(k.)P) 



= {\5{kt) = P{k), 



(A2) 



|ki-fe|<Afe 



where P{k) is the underlying power spectrum. The variance of this estimator is given by 



' Pobsjk) - pjk y 



Nk Nk 



Non-Linear Bias, BAOs and Millennium Simulation 



23 



Assuming that the density field is a Gaussian random variable with its variance given by P{k), i.e., 
we use Wick's theorem for evaluating the last double summation: 



(A4) 



i=l j=l i=l j=l 

=Ni[p{k)r+Nk[p{k)r. 



[Pobsik) - P{k)f) = 



Therefore, the variance is given by 



and the standard deviation is given by 

CTpik) = (^[Pobsik) - P{k)f) 



1/2 



P{k). 



(A5) 
(A6) 

(A7) 



Note that this formula is valid only when 5 is a Gaussian random field. When 5 is non-Gaussian due to, e.g., non- 
linear evolution, primordial non-Gaussianity, non- linear bias, etc., wc must add the connected four-point function to 
Eq. (A5). See also Takahashi et al. (2008) for the study of finite box size effects on the four-point function. 

How do we calculate A^fe? As the Fourier transformation of a real- valued field has symmetry given by S*{k) = 5{—]s.), 
the number of independent fc-modes is exactly a half of the number of modes available in a spherical shell at a given 
k. We find 

1 ink^Sk ( k 

Sk 



2 (<5fc)3 

where 6k is the fundamental wavenumber given by 5k = 2'k/L, where L is the survey size or simulation box size. 
In the literature one may often find a different formula such as 



(A8) 



..literature 



js^literature 
k 



p{k). 



(A9) 



Here, there is an extra factor of \/2, as Nj^^^^^t^re jg ^j^g number of modes available in a spherical shell at a given k, 
without symmetry, 5*(k) = k), taken into account, i.e., ]\[j^terature _ 2Nk. Both formulas give the same results, 
provided that we understand what we mean by Nk in these formulas. 

We have tested the formula Eq. (A7) by comparing it to the standard deviation estimated the ensemble of dark 
matter simulations used in Paper I. (See Paper I for details of the simulations.) Figure Al and A2 show the result of 
this comparison. The formula Eq. (A7) and the simulation data agree well. 

ANALYTICAL MARGINALIZATION OF THE LIKELIHOOD FUNCTION OVER Bf AND Pq 

In this appendix we derive the analytical formulas for the likelihood function marginalized over bf and Pq. 
The likelihood function, Eq. (6), is given by 

[Pobs,^ - bl{Pss,^ + 62^2,* + 6^22,*) " Pq ^ 



jC{bub2,Po,en) 



n 



exp 



(Bl) 



where 9n arc the cosmological parameters that do not depend on any of the bias parameters. The subscript i denotes 
bins, ki. 

Integrating the likelihood function over Pq, we find 



/oo 
dPoC(bi,b2,Po,e„) 
-00 



=A/-, 



/ 2n 



exp 



1 E»>j- w»Wj-(aj- - a^Y 

2 EiWi 



where we have defined new variables 



oos,t 

1 



b2Pb2,i + blPb22,i) 



(B2) 

(B3) 

(B4) 
(B5) 
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Fig. A1. — Standard deviation of the matter power spectrum: analytical versus simulations. The symbols show the standard deviations 
directly measured from 120 independent A^-body simulations whose box sizes are L = 512 Mpc/h (60 realizations for k < 0.24/i/Mpc) 
and L = 256 Mpc/h (60 realizations for 0.24 < k < 0.5?i/Mpc) . Each simulation contains 256"^ particles. The solid and dot-dashed lines 
show the analytical formula (Eq. (A7)) with the 3rd-order PT non-linear P{k) and the linear P{k), respectively. Note that the graph is 
discontinuous at fc = 0.24/i/Mpc because the number of k modes, A'^^, for a given wavenumber k is different for different box sizes. 
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Fig. A2. — Residuals. We divide both analytical estimation and simulation results by the analytical formula (Eq. (A7)) with the 3rd-order 
PT nonlinear P{k). 
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We then integrate this function over 6^. Introducing new variables given by 



I 27r 



Pth,i = PsS,i + b2Pb2,i + 62^622,8 
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(B6) 

(B7) 



and Gi = Pobs,i — b\Pth,i, we rewrite Eq. (B2) as 



=J\fexp 



1 Ei>j WiWj \^{Pth,i - Pth,j)bi - {Pobs,i - Pobsj)] 



l(Abt-2Bbl + c) 



where 



A = 

B 

C = 



_ J2i>j WiWj{Pth,i - Pth,if 



J2i>j WiWj{Pth,i - Pth,j){Pobs,i - Pobsj) 
_ J2i>j WiWj{Pobs,i - Pobsjf 



(B8) 

(B9) 
(BIO) 

(Bll) 



Assuming a flat prior on bf, we integrate the likelihood function to find the desired result: 



C(b2,6n)=M exp -]^[Ab\-2Bb\ + C 



=7Vexp 



£2 _ AC 



2A 



d{bl) 



;^il + erff^) 
2A\ \V2AJ 




(B12) 



Note that the convergence of the likelihood function is ensured by Cauchy's inequality, B"^ — AC < 0. 

DISTRIBUTION OF ERRORS ON THE DISTANCE SCALE 

We find that the error on D extracted from the halo power spectrum of Millennium Simulation is about 2.17% 
for kmax = 1.5 h/Mpc at z = 6. (See Figure 16.) On the other hand, the error on D calculated from the Fisher 
information matrix is 1.57% Are they consistent? 

In order to test whether it is possible to get the error on D far from the value derived from the Fisher matrix, we 
generate 1000 realizations of mock power spectra with the best-fitting bias parameters for halo with kmax = 1-5 /i/Mpc 
at 2: = 6. Then, we calculate the best-fitting value of D as well as the 1-a (68.27% CL) range from the one-dimensional 
marginalized likelihood function of D for each realization. 

We find that the mean 1-a error on D calculated from these realizations is 1.66%, and their standard deviation 
is 0.43%. Figure CI shows the distribution of the fractional 1-a error on D compared with Dtme- While the error 
derived from the Fisher matrix is close to the mean, the error calculated from the Millennium Simulation is on the 
tail of the distribution. The probability of having an error on D greater than that from the Millennium Simulation is 
about 6%, which is acceptable. 
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